Superintendents from districts in the Minority Student Achievement Network (MSAN) challenged
the Strategic Education Research Partnership (SERP) to identify an approach to narrowing the 
minority student achievement gap in Algebra 1 without isolating minority students for intervention. 
SERP partnered with 8 MSAN districts and researchers from 3 universities to design and rigorously 
test AlgebraByExample, a set of 42 Algebra 1 assignments with interleaved worked examples that 
target common misconceptions and errors. In a year-long random-assignment study, students who 
received AlgebraByExample assignments had an average 7 percentage point boost on a posttest 
containing released items from state assessments, and students in the bottom half of the
performance distribution, where minority students are disproportionately concentrated, had an
average 10 percentage point boost on a researcher-designed assessment of conceptual 
understanding. AlgebraByExample is easily incorporated into any existing curriculum, and 
naturally serves as a launch point for mathematically rich discussion. 


The Minority Student Achievement Network (MSAN) is a growing network of 29 suburban and 
small urban districts committed to eliminating the achievement gaps between their White and 
Asian students and their African American and Latino students (who make up from
20% to 83% of the school population). The network was self-organized to accelerate learning
in literacy and math and improve college-going rates by learning from each other and from
research. The course Algebra 1 was of particular concern to these districts because it plays an
important gatekeeper role with regard to higher-level, college-preparatory mathematics and
because the districts were not making progress in accelerating the learning of their African
American and Latino students, despite the districts’ focus on improving curriculum and
instruction to incorporate the current best thinking. The network leaders looked to researchers for new
ideas. 

MSAN recognized that for practitioners with full workloads, the demands of forming and 
managing a productive research-practice partnership are unrealistic. Bridging the worlds of 
research and practice requires identifying and recruiting researchers who have both the expertise
and interest in collaborative, problem-solving research and development, then launching
and managing a productive program of work. When the Strategic Education Research Partnership
(SERP) was formed to play precisely this role, MSAN leaders saw an opportunity.

SERP, an independent, nonprofit organization incubated at the National Academy of Sciences,
was established with a mission to conduct education research and development in Pasteur’s
quadrant (Stokes, 1997), where new knowledge is generated in the interest of solving important 
problems of practice. SERP work is carried out at field sites where partnerships between school 
district practitioners and interdisciplinary research and design teams are created, nurtured, and, 
funding permitting, sustained. Participating districts define the focal problem, and SERP- 
recruited researchers help to frame the problem with respect to knowledge from prior research. 
Researchers and practitioners collaborate at every stage so that the work is not only research-informed,
but also reality-checked. MSAN and SERP formed a partnership in 2006 to address
the Algebra 1 challenge. 


FOCUS ON ALGEBRA 1 

Although SERP work responds to the needs of district partners, expectations are established at 
the outset regarding the ultimate goal of producing knowledge and tools that are of benefit not 
only to the partner district, but to the field of education more broadly. The Algebra 1 achievement
gap certainly meets that criterion. Substantial discrepancies in the mathematics achievement
levels of students by race and ethnicity persist, despite increased attention to the issue.
Results from the most recently available National Assessment of Educational Progress revealed
that the Black-White mathematics achievement gap for eighth graders, a grade in which Algebra 1
is often taught, was 31 points in 2007 (Vanneman, Hamilton, Anderson, & Rahman,
2009). In 2009, the achievement gap between Latino and White students was 26 points, and
was even greater when comparing White students to Latino English language learners (Hemphill
& Vanneman, 2011).

Algebra 1 can be particularly challenging not only because it introduces more abstract representations
and more complex mathematical relationships, but also because it can magnify misconceptions
that have their roots in earlier instruction. Further, given the age at which students
first take an Algebra 1 course, leaders in the MSAN districts feared that emerging self-consciousness
could lead students who perform poorly to perceive themselves as incompetent in
the domain, and become less mastery-oriented and more avoidance-motivated (Harter & Connell,
1984). Although the drive to avoid failure is not uncommon in any domain, students who
select themselves out of mathematics courses at the earliest possible opportunity relinquish
access to a broad range of higher education and career opportunities. 


THE APPROACH 

SERP recruited an interdisciplinary team of researchers to work with practitioners from an initial
subset of five MSAN districts to frame the problem more precisely and define an approach
to addressing it. Mathematics researchers and program designers were selected to bring a 
diverse set of ideas to bear on potential approaches to addressing the problem; among them 
were contributors to the development of four different, innovative mathematics curricula. 

In accordance with SERP protocols, superintendents and instructional leadership in the participating
districts defined the parameters for the work. On the basis of decades of experience,
the district leaders set three constraints: 

1. A supplemental approach introduced in afterschool hours or during the summer was 
undesirable because, in the superintendents’ experience, such programs are the first 
to disappear during budget cuts. 

2. An entirely new curriculum was not an option because both the political and financial 
costs of such a decision would be high, and teachers’ focus would be directed to the 
demands of the new curriculum rather than to struggling students. 

3. The problem could not be addressed by explicitly or uniquely targeting minority students
or struggling students because this would reinforce any tendency for these students
to self-identify as being “not good at math.” Empirical work has demonstrated
that such targeting can increase minority students’ stereotype threat, or “risk of confirming
a negative stereotype about one’s group,” with inimical consequences for
performance attainment (Steele & Aronson, 1995, p. 797). The districts wanted an
approach that would target all students, even if it was intended to be particularly beneficial
for a subset of the population.

SERP balances the interests of practitioners with the interests of researchers; approaches to a 
problem must be based in learning principles for which there is research evidence, and be tested 
using rigorous methodologies. SERP also requires that any solution to a problem be scalable. If
scalability requires that teachers internalize a change and take ownership (Coburn, 2003), then 
it is particularly important to design and test to ensure that teachers see the approach as both 
worthy and workable in their classrooms. 

The SERP-recruited researchers and district practitioners thought through approaches to 
improvement that fit within these multiple constraints. As is often the case, those who are further
down in the district hierarchy brought insights about obstacles to improvement that were
encountered closer to the ground and were not seen by the most senior district leaders. District 
math coordinators emphasized that top-down directives requiring teachers to substantially 
change the way they teach were not going to work because algebra teachers in their districts 
believe they know more about mathematics teaching and about their students than anyone in 
the district’s central office. Furthermore, they argued that teachers would resist making changes 
that diminish their sense of control, a phenomenon documented in field research by Mary 
Kennedy (2005).

A math coordinator in one of the districts recounted a recent experience in which a practice
in which teachers had shown no interest was introduced in summer school, where the stakes are
lower, and later spread into the school year. He argued that the likelihood of success in improving
Algebra 1 learning would be highest if a back door was identified: an unobtrusive
approach that would demonstrate the benefit of a new practice without having the primary routines
of current practice upended. Ken Koedinger, a SERP-recruited researcher from Carnegie
Mellon University, proposed an approach that he and his colleagues found to be powerful in 
experimental research: providing students with worked examples and self-explanation prompts 
(cf. Pashler et al., 2007). Worked examples were especially appealing because they could be 
easily introduced through assignments. This approach would not require significantly disrupting
a teacher’s instructional routines, but had the potential to increase the efficacy of teacher and
student work. The approach was not intended to be teacher-proof. In fact, the hope was that
teachers would review assignments as a classroom activity, and thus be exposed to students’ 
thinking and to the value of reasoning through worked examples. 

Over the next 7 years, SERP staff members and researchers collaborated with practitioners 
across eight MSAN districts to iteratively develop and test AlgebraByExample, a set of 42 
math assignments that incorporate powerful research findings regarding student misconceptions 
and the value of self-explanation. 


ALGEBRABYEXAMPLE RESEARCH BASE 

Many students enter the Algebra 1 classroom holding misconceptions that have the strong 
potential to derail new learning (Brown, 1992; Chiu & Liu, 2004; Kendeou & van den Broek,
2005). Within the domain of equation-solving alone, a number of misconceptions have been
identified as critical, including the idea that the equals sign is an indicator of operations to be
performed (Baroody & Ginsburg, 1983; Kieran, 1981; Knuth, Stephens, McNeil, & Alibali,
2006); that negative signs represent only the subtraction operation and do not modify terms
(Vlassis, 2004); that subtraction is commutative (Warren, 2003); and that variables cannot take
on multiple values (L. R. Booth, 1984; Knuth et al., 2006; Küchemann, 1978). Not surprisingly,
such misconceptions have been shown to affect students’ success in problem solving and hinder
their learning of new material (J. L. Booth & Koedinger, 2008). For many students, these
misconceptions persist even after traditional classroom instruction on the relevant topic (J. L.
Booth, Koedinger, & Siegler, 2007; Vlassis, 2004). A persuasive body of evidence from a variety
of content areas suggests that dislodging misconceptions can be devilishly difficult, and
often requires directly drawing out and confronting the flawed thinking head on (Donovan & 
Bransford, 2005). 


Worked Examples With Self-Explanation 

Worked examples, mathematics problems with worked-out solutions provided, offer an
opportunity to call students’ attention to common misconceptions. Some textbooks provide
one, or a small number, of worked examples at the beginning of a problem set; students are simply
asked to study a problem solution rather than solve a problem themselves. Assignments with
interleaved worked examples, in contrast, embed the worked solutions throughout a problem
set, alternating between examples and problems for students to solve. Positive effects of interleaving
worked examples have been reported in a variety of courses (Clark & Mayer, 2003),
including algebra (Sweller & Cooper, 1985). Replacing many of the problems in a practice session
with examples of how to solve a problem leads to the same amount of procedural learning
in less time (Clark & Mayer, 2003; Zhu & Simon, 1987), or increased learning and transfer of
knowledge in the same amount of time (Paas, 1992). Worked examples of problem solutions
are thought to be more efficient for learning new tasks because they reduce the load on working
memory (compared with completing long strings of practice problems), thereby allowing students
to learn the steps in problem solving (Sweller, 1999).

The value of worked examples can be enhanced with self-explanation prompts. Numerous
empirical results have supported the notion that self-explanation is beneficial for learning
(see Chi, 2000, for a review). When individuals self-explain, they integrate various
pieces of knowledge (either from the instructional material, their own prior knowledge, or 
a combination of the two), generate inferences to fill gaps in their own knowledge, and 
make explicit the new knowledge and the connections that they’ve generated (Chi, 2000; 
Roy & Chi, 2005). High-ability students tend to spontaneously self-explain more often 
than low-ability students (Chi, Bassok, Lewis, Reimann, & Glaser, 1989), but students of
any level who are prompted to self-explain learn more than those who do not self-explain 
(Chi, de Leeuw, Chiu, & Lavancher, 1994). 

Empirical laboratory studies have shown that asking students to explain the errors in incorrect
solutions, as well as the successful strategy in correct solutions, leads to greater learning
than explaining correct solutions only (Durkin & Rittle-Johnson, 2009; Grosse & Renkl, 2004,
2007; Rittle-Johnson, 2006; Siegler, 2002; Siegler & Chen, 2008). These studies suggest that
two of the main mechanisms by which incorrect examples improve learning are providing negative
feedback and causing cognitive conflict. Negative feedback reduces the relative strength of
incorrect strategies, which, when coupled with correct examples or instruction on correct procedures,
is even more likely to cause procedural improvement. Cognitive conflict forces students
to see the differences between the presented problem and others where a procedure does work,
which in turn strengthens the likelihood of causing conceptual improvement. Recent work suggests
that incorrect examples may be even more important than correct examples for promoting
conceptual understanding (J. L. Booth, Lange, Koedinger, & Newton, 2013). 

Incorrect examples can be used to target common misconceptions that make solving a particular
type of problem difficult. For example, students may commonly use problem-solving strategies
that produce right answers in some situations (e.g., combine two terms by adding the
numbers involved; 4x + 3x is 7x), yet harbor misconceptions (about the nature of variable vs.
constant terms) that become apparent when they attempt to generalize this strategy to problems
where it is not appropriate (e.g., 4x + 3 is not 7x). When students study and explain incorrect
examples, they directly confront these faulty concepts. They are less likely to acquire or maintain
misconceptions because they have identified what is wrong, and explained why it is wrong
(Ohlsson, 1996; Siegler, 2002). 
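
The contrast between the valid strategy and its overgeneralization can be checked numerically. The following is a minimal illustrative sketch, not part of the AlgebraByExample materials; the function names and test values are hypothetical. It compares the identity 4x + 3x = 7x with the misconception that 4x + 3 equals 7x.

```python
def like_terms_identity_holds(x):
    """Valid strategy: 4x + 3x equals 7x for every value of x."""
    return 4 * x + 3 * x == 7 * x


def overgeneralized_rule_holds(x):
    """Misconception: treating 4x + 3 as if it were 7x."""
    return 4 * x + 3 == 7 * x


# Hypothetical values chosen only for illustration.
test_values = [-2, 0, 1, 2, 5]

print([like_terms_identity_holds(x) for x in test_values])
print([overgeneralized_rule_holds(x) for x in test_values])
```

The first check succeeds for every value, whereas the second succeeds only at x = 1, which illustrates how a flawed strategy can occasionally produce a right answer and so go unnoticed.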

Many studies have established the benefits for procedural knowledge of worked examples
(e.g., Sweller & Cooper, 1985; Zhu & Simon, 1987), and the benefits for conceptual understanding
of self-explanation (e.g., Chi, 2000). However, few studies were conducted in classroom
settings with real teachers teaching their own students. The studies in most publications
consisted of single classroom lessons. Further, although this research-backed approach had been
recommended for instructional use by the U.S. Department of Education (Pashler et al., 2007), 
worked examples are not now a common feature of mathematics textbooks or other classroom 
resources. 


MAKING WORKED EXAMPLES WORK IN THE CLASSROOM 

In the SERP-MSAN partnership, as in all SERP partnerships, once the problem identification 
and framing phase has produced a direction for the work, a smaller research and development 
team is formed with the appropriate expertise to execute the plan. With the identification of 
worked examples as a viable approach to improving Algebra 1 outcomes, Ken Koedinger and 
Julie Booth became lead researchers. A working group was formed that included a mathematics
coordinator and two Algebra 1 teachers from each of the participating school districts.
In early meetings, the group explored findings from the research literature, and compared
Algebra 1 assignments from 11 MSAN classrooms. The typical assignment contained 10 to
12 problems to solve. Of the 128 items across all of the assignments, only six requested an
explanation and only one provided a worked example of a problem solution. Similarities and
differences in algebra textbooks, in mathematics instruction, and in culture across the districts,
and opportunities and constraints relevant to conducting a research project were also
explored. Math coordinators and participating teachers came to feel that this research
process was truly collaborative and that teachers were being listened to in the planning meetings.
As a result, they succeeded in persuading teachers in all of the districts to engage in a
random assignment study. 


Developing the Assignments 

The initial assignments with interleaved worked examples were drafted by the research team to 
target concepts and misconceptions that were identified as problematic for students based on 
the research literature, the prior findings of the research team, and the experience of the practitioner
codevelopers. The practitioner codevelopers then reviewed, revised, and, in some cases,
reshaped the assignments so that the language and level of challenge were appropriate to the student
population. An initial bank of 24 Algebra I assignments was produced and tested for
usability and feasibility with a subset of teachers across five MSAN districts during the 2008-2009
school year. Results of a pilot study with three classrooms of students (n = 51) using four
assignments were extremely promising. In each participating class, a randomly chosen half of the
students (n = 26 in total) completed four AlgebraByExample assignments over the course of their
chapter on equation solving; the other half (n = 25) completed control assignments. Students
also completed a pretest and posttest of conceptual and procedural knowledge. Results revealed 
that students who received AlgebraByExample assignments improved more than students who 
received control assignments. The effect was even more pronounced for minority students and 
for low-achieving students. Though an appreciable achievement gap was found at pretest (63% 
vs. 72% correct), no difference was found between groups at posttest (both Caucasian and 
minority students answered 72% of problems correctly; J. L. Booth et al., in revision).

Teachers who used the AlgebraByExample assignments indicated that the assignments
prompted interesting classroom discussions about students’ misconceptions; teachers also felt
their students learned more from these assignments than from typically designed control assignments.
Collaborating teachers who asked individual students to think aloud while working with
an assignment also noted that the examples both caused students to confront their misconceptions
and helped them figure out how to solve other problems in the assignment.

Focus-group teachers provided suggestions on how the approach could be more easily implemented
and made more useful for improving student learning. First, the teachers indicated that
the scope of individual assignments should be tightened so each covered content that would typically
be taught within a single lesson. Teachers were also concerned that individual misconceptions
or common errors were typically only covered once in the whole set of assignments. They
suggested that it would be better to have a higher dosage of assignments on key topics, so that
misconceptions and common errors could be targeted more than once. Based on the student learning results and
teachers’ embrace of the assignments approach as a whole, the SERP team concluded that the 
approach was promising, but that further development of the materials was necessary before an 
efficacy study was warranted. 

Refining the Assignments 

In summer 2010, SERP was awarded a 3-year development grant from the Institute of Education
Sciences, U.S. Department of Education, to further develop and test the assignments, and
to explore the role of motivation as a mediator of impact. In year 1, the number of assignments
increased from 24 to 42, and the number of items per assignment was reduced from 12
to 6-8. The content of each assignment was narrowed to better fit within a single lesson
plan. Pilot studies were conducted with three or four assignments done in class to determine 
feasibility, and to gather data from small samples of students. Following the single unit studies, 
nine major algebra topics were agreed upon by the researchers and practitioners, and the content 
within each unit was adjusted to fit a wider range of curricula. Specific items were also refined 
to better address critical misconceptions. 

In 2011-2012, a series of double-unit studies containing six to 14 assignments was completed
across six districts. Results indicated that with more assignments used, AlgebraByExample
led to improved conceptual scores and equivalent procedural scores. In addition, non-Asian
minority students benefited more from AlgebraByExample even when prior ability, as measured
by the pretests, was controlled for. In preparation for the year-long study, items were further
refined and assignments were reformatted and compiled into a spiral-bound workbook to
ensure ease of use throughout the school year; this also reduced the danger of data loss because 
teachers distributed and collected workbooks each time they were used. 

Testing the Assignments 

In the 2012-2013 school year, AlgebraByExample was tested in 28 classrooms taught by 12 
teachers of nonhonors Algebra I from five MSAN school districts across five states. Individual 
classrooms were randomly assigned to condition such that each participating teacher had at least one
treatment and one control classroom, which controlled for teacher variability. One teacher (3
classrooms) did not complete the study; thus the data for those students were excluded from all
analyses. The final sample consisted of 380 students (189 experimental, 191 control; 47% boys;
50% low socioeconomic status) in 25 classrooms (13 treatment classrooms; 12 control classrooms).
The ethnicity breakdown was: 30% White, 39% Black, 18% Hispanic, 7% Asian, and
6% biracial. Students were classified as underrepresented minority (URM; Black, Hispanic, 
biracial) or non-URM students (White, Asian); 63% of the students were classified as URM. 
The entire study was conducted in a typical course setting, with testing done as part of normal 
classroom activities and assignments administered by teachers as routine class work. 
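
The classroom-level assignment described above, in which every teacher ends up with at least one treatment and one control classroom, can be sketched as a randomization blocked by teacher. The article does not describe the exact procedure the research team used, so the function `assign_classrooms`, the classroom labels, and the seed below are hypothetical and serve only to illustrate one way such a constraint could be met.

```python
import random


def assign_classrooms(classrooms_by_teacher, seed=0):
    """Randomly assign classrooms so each teacher has both conditions.

    Assumed procedure (not taken from the article): shuffle each
    teacher's classrooms, give the first to treatment and the second to
    control, then assign any remaining classrooms at random.
    """
    rng = random.Random(seed)
    assignment = {}
    for teacher, rooms in classrooms_by_teacher.items():
        if len(rooms) < 2:
            raise ValueError(f"{teacher} needs at least two classrooms")
        shuffled = list(rooms)
        rng.shuffle(shuffled)
        assignment[shuffled[0]] = "treatment"
        assignment[shuffled[1]] = "control"
        for room in shuffled[2:]:
            assignment[room] = rng.choice(["treatment", "control"])
    return assignment


# Hypothetical teachers and classrooms.
example = {"teacher_A": ["A1", "A2"], "teacher_B": ["B1", "B2", "B3"]}
result = assign_classrooms(example, seed=42)
for room in sorted(result):
    print(room, result[room])
```

Blocking by teacher in this way means that any treatment-control comparison is made within teachers as well as across them, which is the sense in which the design controls for teacher variability.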


MEASURES 


Prior Knowledge 

Students’ prior knowledge was assessed at pretest using a paper-and-pencil test consisting of 81 
items. The measure was designed based on the views of collaborating teachers about the prealgebraic
knowledge they would ideally want their Algebra I students to have at the start of the
year. The percentage of prior knowledge items answered correctly was computed for each student.
Sample items can be found in Table 1.

Motivation 

Student motivation was assessed at pretest with brief, age-adapted versions of two well-established 
measures in the achievement motivation literature: interest (Elliot & Harackiewicz, 1996) and competence
expectancy (Elliot & Church, 1997). Students provided responses on Likert-type scales
from 1 (no, not at all) to 7 (yes, definitely), and scores for each measure were calculated according 
to previously established guidelines (Elliot & Church, 1997; Elliot & Harackiewicz, 1996). 

Conceptual and Procedural Knowledge 

We operationally define conceptual knowledge as an understanding of the core features in problems
for a given topic, and procedural knowledge as the ability to carry out procedures to solve
problems in that topic (e.g., J. L. Booth, 2011). Conceptual and procedural knowledge were
assessed using a single paper-and-pencil test consisting of 66 items (41 conceptual, 25 procedural).
The percentage of conceptual items answered correctly and the percentage of procedural
items answered correctly were computed for each student. Sample items can be found in Table 1. 
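
The subscale scoring described above amounts to a percent-correct calculation applied separately to the 41 conceptual and 25 procedural items. The sketch below is illustrative only: the function name and the student's responses are hypothetical, and the article does not state how skipped items were scored, so each item is simply treated as correct or not correct.

```python
def percent_correct(responses):
    """Return the percentage of items marked correct (True)."""
    if not responses:
        raise ValueError("no items were scored")
    return 100.0 * sum(responses) / len(responses)


# Hypothetical student: 30 of 41 conceptual and 20 of 25 procedural items correct.
conceptual = [True] * 30 + [False] * 11
procedural = [True] * 20 + [False] * 5

print(round(percent_correct(conceptual), 1))  # 73.2
print(percent_correct(procedural))            # 80.0
```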


Standardized Test Items 

Students were administered the paper-and-pencil test utilized by J. L. Booth, Barbieri, Eyer, and
Paré-Blagoev (2014). The test consisted of 10 algebra-related released items taken from the five
standardized tests used by the participating districts: Ohio Achievement Test, Grade 8 (3 items;
Ohio Department of Education, 2006); Standards of Learning Test, Grade 8 Mathematics (1
item; Virginia Department of Education, 2008); Illinois Standards Achievement Test, Grade 8
Math (3 items; Illinois State Board of Education, 2009); Wisconsin Knowledge and Concepts
Examination, Grade 8 Mathematics (2 items; Wisconsin Department of Public Instruction,
2006); and the EXPLORE test (1 item; American College Test, 2014). For each student, the percentage
of problems answered correctly was computed. Sample items can be found in Table 1.

TABLE 1

Sample Test Items

Assessment / Sample Items

Prior knowledge

  State whether each of the following is equivalent to x + 4 - 2 + x (Yes / No):
    a. (x + 4) - (2 + x)
    b. 4 [illegible] x - 2 + x
    c. x [illegible] (4 - 2) + x
    d. x [illegible] 4 - x + 2
    e. (x + 4) + (-2 + x)
    f. x - 4 + x - 2
    g. x - 2(2 - 1) + x

  Evaluate each expression for the values x = 2, y = 3, and z = 4:
    a) 5x + y - z
    b) x - 3(y + z)
    c) [illegible]/z + 2y

Conceptual posttest

  State whether each of the following is true for the quadratic function y = -x^2 - 2x + 3 (Yes / No):
    a. The axis of symmetry is x = 1
    b. The vertex is a minimum
    c. The vertex is (-1, 4)
    d. The vertex is (-1, 0)
    e. The vertex is (4, -1)

  State how many terms the resulting expression will have (2, 3, or 4):
    a. 2x(3 + 4x)
    b. (2x^[illegible] - 3)(4x - 1)
    c. (x + 2)^2
    d. (x + 4)(x - 4)
    e. 2([illegible] - 4x[illegible] - 1)
    f. (2x^[illegible] + 3)(4x - 1)

Procedural posttest

  Simplify each expression using only positive exponents.
    a. [illegible]
    b. ([illegible])^3

  Factor each expression completely.
    a. 3x^3 - 18x
    b. [illegible]x^4 - 3x^[illegible] + 9x^[illegible]

Standardized items

  What is the slope of the line containing the points (-2, 5) and (1, -7)?
    a. -4
    b. -2
    c. 2
    d. 4

  Which is one value of the set of x that makes the following true?
    7x + 3 > 17
    a. 0
    b. 1
    c. 2
    d. 3

Teacher Reports 

At the end of the year, teachers were administered a survey about their experience in the AlgebraByExample
study. In one item, they were asked about the frequency with which they
reviewed study assignments in class. Teachers responded by selecting one of the following
options: 0-20% of the time; 20-40% of the time; 40-60% of the time; 60-80% of the time; or
80-100% of the time. Teacher responses were recoded into a 1 (0-20%) to 5 (80-100%) scale.
In addition, teacher checklists were used to determine how many assignments were used in 
each of their classes. 
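
The recoding of the review-frequency item is a direct lookup from response option to a 1-5 code. A minimal sketch follows; the dictionary and function names are hypothetical, and the option strings are assumed to match the survey wording quoted above.

```python
# Response options as quoted in the text, mapped to the 1-5 scale.
REVIEW_FREQUENCY_CODES = {
    "0-20% of the time": 1,
    "20-40% of the time": 2,
    "40-60% of the time": 3,
    "60-80% of the time": 4,
    "80-100% of the time": 5,
}


def recode_review_frequency(response):
    """Map a survey response option to its 1-5 code."""
    return REVIEW_FREQUENCY_CODES[response.strip()]


print(recode_review_frequency("40-60% of the time"))  # 3
```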

Instructional Manipulation 

Workbooks containing 42 assignments were provided to students in the beginning of the school 
year (see Table 2 for a list of assignment topics). Two versions of the workbook were created. 
One contained all of the treatment versions (worked example and self-explanation prompt) of 
the assignments, and the other contained all of the control versions of the assignments. Each 
control assignment contained 6-8 problems to solve that were isomorphic to those found in the
relevant textbook chapters (the majority of assignments had eight items; however, the quadratics
assignments had only six, as each item typically takes longer to complete). Each treatment
assignment contained the same 6-8 items, but worked examples were provided for the four left- 
hand items (in quadratics assignments, there were only three left-hand items). Most assignments 
contained two correct examples to explain and two incorrect examples to explain, and all 
assignments had at least one of each type of example. See Figures 1 and 2 for excerpts of a 
treatment and a corresponding control assignment. Assignment workbooks were collected at 
the end of the school year, and student work was coded to determine how many of the left-hand 
items and right-hand items on each assignment were sufficiently attempted by the student. In 
the AlgebraByExample assignments, left-hand items are worked examples and right-hand items are
traditional problems for students to solve on their own. Control assignments had only traditional
problems. The number of assignments for which at least 75% of the left-hand items were
attempted and the number of assignments for which at least 75% of the right-hand items were
attempted were computed for each student.

TABLE 2

Comprehensive List of AlgebraByExample Assignment Topics

Absolute value
Combining like terms
Decimals
Distributive property
Fractions
Order of operations
Analyzing answers to word problems
Graphing linear equations
Slope
Slope-intercept form
Writing equations in slope-intercept form
Solving 1- and 2-step equations
Solving multi-step equations
Solving multi-step equations with fractions
Writing proportions
Solving proportions
Writing expressions and equations from words
Solving systems of equations by graphing
Solving systems of equations by elimination with multiplication
Solving systems of equations by elimination with adding and subtracting
Solving systems of equations by substitution
Adding and subtracting inequalities
Multiplying and dividing inequalities
Graphing inequalities
Compound inequalities
Writing inequalities from words
Multiplication and division properties of exponents
Power to power properties of exponents
Zero and negative properties of exponents
Multiple properties of exponents
Graphing tables of exponents
Growth and growth rate
Decay and decay rate
Decay and growth rate
Adding and subtracting polynomials
Multiplying monomials by polynomials
Factoring binomials
Multiplying binomials
Quadratic formula
Solving quadratics by factoring
Solving quadratics using square root
Graphing quadratic functions
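
The 75% attempt criterion used in coding the workbooks can be expressed as a simple count per student, computed once for left-hand items and once for right-hand items. The sketch below is illustrative; the function name, the data layout (pairs of items attempted and items available), and the example values are hypothetical rather than taken from the study's coding scheme.

```python
def count_sufficient_assignments(attempts, threshold=0.75):
    """Count assignments on which at least `threshold` of items were attempted.

    `attempts` is a list of (items_attempted, items_available) pairs,
    one pair per assignment, for one side (left-hand or right-hand).
    """
    return sum(
        1
        for attempted, available in attempts
        if available > 0 and attempted / available >= threshold
    )


# Hypothetical left-hand item counts for one student on five assignments.
left_hand = [(4, 4), (3, 4), (2, 4), (3, 3), (2, 3)]
print(count_sufficient_assignments(left_hand))  # 3
```

In this example, the assignments with 4 of 4, 3 of 4, and 3 of 3 items attempted meet the criterion, whereas 2 of 4 and 2 of 3 do not.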

PROCEDURE 

Soon after the school year started and before any assignments were administered, all classes 
were administered the prior-knowledge test and the motivation survey. During the school year, 
teachers administered study assignments when they were relevant to the content being covered 
in their classes; teachers were asked to make sure they used the same assignment topics in both 
treatment and control classrooms. Student assignment workbooks were collected at the end of 
the school year. At the end of the year, all classes were administered the conceptual posttest, 
procedural posttest, and the standardized-test item measure during a single class period. 
Deidentified demographic data were provided by the school districts. 


RESULTS 

The research team conducted analyses in which students were nested in classrooms, with a 
sufficient number of clusters (25) to warrant the use of multilevel modeling (Maas & Hox, 2005). 
Thus, all analyses were conducted with hierarchical linear modeling (Raudenbush, Bryk, & 
Congdon, 2004). 

All analyses used student data at Level 1 and classroom data at Level 2. Level 1 included the 
student’s prior-knowledge scores and posttest conceptual, procedural, and standardized test 
item scores. Level 1 also included pretest interest and competence expectancy scores, the 
student’s URM status (URM vs. non-URM), and the number of assignments on which students 
demonstrated sufficient effort on the left-hand items and the number on which they 
demonstrated sufficient effort on the right-hand items. Level 2 included whether or not the 
classroom was randomly assigned to the treatment condition, the proportion of students in the 
classroom who were URM, the number of assignments the teacher used in the class, and the 
frequency with which the teacher reported reviewing the study assignments in class. Two 
students completed the posttest conceptual and procedural tests, but not the posttest 
standardized items test. To include the maximum number of students in each analysis, students 
with missing data were excluded only from the individual analyses that required those data. 
Descriptive statistics may be found in Table 3. 


Effectiveness of the AlgebraByExample Intervention 

To determine whether worked example assignments improved learning for algebra students, and 
whether there are differences in benefit based on student-level individual differences, we 
conducted a series of two-level hierarchical linear models with individual students nested in 
classrooms. For conceptual, procedural, and standardized item posttest scores, we first tested an 
empty model and determined that 55% of the variance in posttest conceptual scores, 41% of the 
variance in posttest procedural scores, and 44% of the variance in posttest standardized item 
scores was between classrooms, supporting the need for multilevel modeling. Because all 
intraclass correlations were significant, we report estimates with robust standard errors 
(Raudenbush & Bryk, 2002). The focal variables in the subsequent models were treatment (whether 
the students received the worked example assignments or not) and the interactions between 
treatment and prior knowledge and between treatment and URM. Pretest scores on the interest and 
competence expectancy scales, effort on left-hand and right-hand items, as well as the 
percentage of URM per classroom, the number of assignments used in each classroom, and the 
frequency with which assignments were reviewed in the classroom are included in each model as 
control variables. 


FIGURE 1 Excerpt from the treatment version of the “Writing Equations in Slope-Intercept Form” assignment. 

FIGURE 2 Excerpt from the control version of the “Writing Equations in Slope-Intercept Form” assignment. 


TABLE 3 

Descriptive Statistics for Level 1 and Level 2 Variables 

Variable                                  N     Mean   Standard    Minimum   Maximum 
                                                       Deviation 
Student level variables: Level 1 
  URM                                    380    0.63     0.48        0.00      1.00 
  Prior knowledge score                  380    0.53     0.19        0.01      0.99 
  Pretest competence expectancy score    380    5.86     1.19        1.00      7.00 
  Pretest interest score                 380    4.40     1.65        1.00      7.00 
  Assignment effort (right-hand)         380   18.80    11.09        0.00     40.00 
  Assignment effort (left-hand)          380   18.30    11.14        0.00     40.00 
  Posttest procedural score              380    0.25     0.24        0.00      1.00 
  Posttest conceptual score              380    0.50     0.22        0.00      0.98 
  Posttest standardized item score       378    0.49     0.34        0.00      1.00 
Classroom level variables: Level 2 
  Treatment                               25    0.52     0.51        0.00      1.00 
  Proportion URM (class composition)      25    0.50     0.33        0.07      1.00 
  Assignments used                        25   26.32     8.47       15.00     40.00 
  Reviewed                                25    3.00     1.04        1.00      5.00 

Note. URM = Underrepresented minority. 
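The share of variance lying between classrooms, which the empty models above report as 55%, 41%, and 44%, is the intraclass correlation: the classroom-level variance divided by the sum of the classroom-level and student-level variances. The sketch below shows that arithmetic. The empty-model variance components are not reported in the article, so the example plugs in the residual variance components from Table 4, Model 1 (.0048 and .0131) purely for illustration. Because those come from a model with predictors, the result is a residual intraclass correlation and is not expected to reproduce the 55% figure. The function name is an assumption of this sketch.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Proportion of total variance that lies between clusters (classrooms)."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Residual variance components from Table 4, Model 1 (conditional on predictors,
# so this is NOT the empty-model ICC reported in the text).
residual_icc = intraclass_correlation(between_var=0.0048, within_var=0.0131)
print(f"Residual ICC, conceptual posttest, Model 1: {residual_icc:.3f}")
```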


Conceptual Knowledge 

As shown in Table 4, after controlling for interest and competence expectancy at pretest, 
assignment use, assignment effort, and URM status, model 1 indicated that students with higher 
prior knowledge scored higher on the posttest conceptual test; students in classrooms where the 
teacher reported reviewing the study assignments frequently also had higher posttest conceptual 
scores. Model 1 also yielded a trend toward a main effect of treatment, with students in the 
treatment group outperforming those in the control group. When the interactions between 
treatment and both prior knowledge and URM were included, model 2 yielded a significant main 
effect of treatment (see Figure 3) and a significant interaction between treatment and prior 
knowledge, indicating that the influence of condition on conceptual posttest scores varied by 
students’ prior knowledge levels. As shown in Figure 4, the benefit of being in the treatment 
group was especially strong for students with lower prior knowledge. The Bayesian Information 
Criterion (BIC) decreased slightly from model 1 (-376) to model 2 (-377), indicating a slightly 
better fit of model 2 to the data. 


TABLE 4 

Predictors of Posttest Conceptual Scores 

                                             Model 1               Model 2 
                                             Main Effects Model    Interaction Model 
Fixed effects 
 Student-level 
  URM                                        -.01 (.02)            .01 (.02) 
 Classroom-level 
  Intercept                                  .34** (.05)           .33** (.05) 
  Treatment                                  .06† (.03)            .06* (.03) 
  Proportion URM (class composition)         -.12** (.04)          -.11** (.04) 
  Assignments used                           .00 (.00)             .00 (.00) 
  Frequency of review                        .04* (.02)            .04* (.02) 
  Treatment x Prior knowledge interaction    n/a                   -.34** (.12) 
  Treatment x URM interaction                n/a                   -.04 (.03) 
Random effects 
 Student-level 
  Prior knowledge score [a]                  .21* (.08)            .38** (.08) 
  Pretest interest score [a]                 .00 (.01)             .00 (.01) 
  Pretest competence expectancy score [a]    -.01 (.01)            -.01 (.01) 
  Assignment effort (right-hand) [a]         .01 (.00)             .01 (.00) 
  Assignment effort (left-hand) [a]          .00 (.00)             .00 (.00) 
Variance components 
  Classroom-level                            .0048                 .0052 
  Student-level                              .0131                 .0128 
  Proportion reduction in variance 
  (from Model 1)                             n/a                   .0191 

Note. URM = Underrepresented minority. Standard errors are in parentheses. n/a = not included in model. 
[a] Predictor is grand-mean centered. 
†p < .10; *p < .05; **p < .01. 
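To make the treatment-by-prior-knowledge interaction concrete, the Model 2 coefficients in Table 4 can be combined into an implied treatment effect at different prior-knowledge levels. The sketch below is a reading aid, not the authors' analysis. It assumes that prior knowledge enters the model grand-mean centered (as the table footnote indicates), that URM is coded 0/1 and left uncentered, and that the prior-knowledge standard deviation of 0.19 from Table 3 is a sensible step size. The function name is hypothetical, and the URM interaction term is included only for completeness since it was not statistically significant.

```python
# Model 2 coefficients for the conceptual posttest (Table 4).
TREATMENT = 0.06           # treatment effect at mean prior knowledge, URM = 0
TREATMENT_X_PRIOR = -0.34  # treatment x prior knowledge interaction
TREATMENT_X_URM = -0.04    # treatment x URM interaction (not significant)
PRIOR_SD = 0.19            # SD of prior knowledge score (Table 3)


def treatment_effect(prior_centered: float, urm: int = 0) -> float:
    """Implied treatment effect on the conceptual posttest (proportion-correct scale).

    prior_centered is the prior knowledge score minus the grand mean (assumed centering).
    """
    return TREATMENT + TREATMENT_X_PRIOR * prior_centered + TREATMENT_X_URM * urm


for label, offset in [
    ("1 SD below mean", -PRIOR_SD),
    ("at mean", 0.0),
    ("1 SD above mean", PRIOR_SD),
]:
    print(f"{label}: {treatment_effect(offset):+.3f}")
```

Under these assumptions the implied benefit is roughly 12 percentage points one standard deviation below the mean, 6 points at the mean, and close to zero one standard deviation above it, consistent with the statement that the benefit was especially strong for students with lower prior knowledge.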


Procedural Knowledge 

Parallel analyses were conducted to understand predictors of posttest procedural scores. As 
shown in Table 5, after controlling for interest and competence expectancy at pretest, 
assignment use, assignment effort, and URM status, students with higher prior knowledge scored 
higher on the posttest procedural test; students in classrooms where the teacher reported 
reviewing the study assignments frequently also had higher posttest procedural scores. Model 1 
also yielded a trend toward a main effect of treatment, with students in the treatment group 
outperforming those in the control group. Model 2 yielded a significant main effect of treatment 
(see Figure 5). The influence of treatment on procedural scores did not vary by students’ prior 
knowledge or URM status. The BIC did not decrease from model 1 (-466) to model 2 (-461). 


Standardized Test Items 

Parallel analyses were conducted to understand predictors of performance on standardized test 
items. As shown in Table 6, after controlling for interest and competence expectancy at pretest, 
assignment use, assignment effort, and URM status, both model 1 and model 2 yielded a main 
effect of treatment on standardized test item scores, such that students in the treatment group 
tended to have higher standardized test item scores at posttest (see Figure 6). Both models also 
indicated that students with higher prior knowledge scored higher on the standardized test 
items, and students in classrooms where the teacher reported reviewing the study assignments 
frequently also had higher standardized test item scores. The influence of treatment on 
standardized test item scores did not vary by students’ prior knowledge level or URM status. 
The BIC did not decrease from model 1 (24) to model 2 (28). 


FIGURE 3 Effect of treatment on posttest conceptual scores. 

FIGURE 4 Interaction between prior knowledge and treatment on posttest conceptual scores. 


TABLE 5 

Predictors of Posttest Procedural Scores 

                                             Model 1               Model 2 
                                             Main Effects Model    Interaction Model 
Fixed effects 
 Student-level 
  URM                                        -.01 (.02)            .01 (.03) 
 Classroom-level 
  Intercept                                  .11* (.05)            .10* (.05) 
  Treatment                                  .03* (.01)            .06* (.03) 
  Proportion URM (class composition)         -.07** (.02)          -.07** (.02) 
  Assignments used                           .00 (.00)             .00 (.00) 
  Frequency of review                        .03** (.01)           .03** (.01) 
  Treatment x Prior knowledge interaction    n/a                   .06 (.09) 
  Treatment x URM interaction                n/a                   -.03 (.04) 
Random effects 
 Student-level 
  Prior knowledge score [a]                  .40** (.05)           .37** (.06) 
  Pretest interest score [a]                 .00 (.00)             .00 (.00) 
  Pretest competence expectancy score [a]    .00 (.00)             .00 (.00) 
  Assignment effort (right-hand) [a]         .00 (.00)             .00 (.00) 
  Assignment effort (left-hand) [a]          .00 (.00)             .00 (.00) 
Variance components 
  Classroom-level                            .0028                 .0029 
  Student-level                              .0110                 .0109 
  Proportion reduction in variance 
  (from Model 1)                             n/a                   .0046 

Note. URM = Underrepresented minority. Standard errors are in parentheses. n/a = not included in model. 
[a] Predictor is grand-mean centered. 
†p < .10; *p < .05; **p < .01. 


DISCUSSION 

AlgebraByExample provides a boost in performance, with the greatest impact on students at 
the lower end of the performance distribution. On the researcher-designed assessment of 
conceptual understanding, treatment students in the lower half of the performance distribution 
outscored comparable control students by approximately 10 percentage points. Treatment 
students overall scored 7 percentage points higher on a test composed entirely of 
released items from the state standardized tests, and 5 percentage points higher on the 
conceptual posttest. Procedural posttest scores were also 4 percentage points higher in the 
treatment group, even though control students had twice the practice solving problems on 
the assignments. Of equal significance, AlgebraByExample achieved these gains with an 
intervention that meets all the constraints imposed by the districts: it targets all students, it 
can be used with any existing Algebra 1 curriculum, and it is an asset that teachers can use 
with minimum disruption to their practice. Moreover, study teachers reported that students 
using AlgebraByExample required less teacher support to complete assignments than those 
using control assignments, and reported rethinking their own practice in response to 
students’ positive experiences with worked examples. 


TABLE 6 

Predictors of Posttest Standardized Item Scores 

                                             Model 1               Model 2 
                                             Main Effects Model    Interaction Model 
Fixed effects 
 Student-level 
  URM                                        -.05† (.03)           -.01 (.05) 
 Classroom-level 
  Intercept                                  .45** (.07)           .43** (.08) 
  Treatment                                  .06* (.03)            .13* (.06) 
  Proportion URM (class composition)         -.16** (.03)          -.15** (.03) 
  Assignments used                           -.01* (.00)           -.01* (.00) 
  Frequency of review                        .06** (.02)           .06** (.02) 
  Treatment x Prior knowledge interaction    n/a                   -.09 (.18) 
  Treatment x URM interaction                n/a                   -.09 (.06) 
Random effects 
 Student-level 
  Prior knowledge score [a]                  .53** (.09)           .59** (.16) 
  Pretest interest score [a]                 .00 (.01)             .00 (.01) 
  Pretest competence expectancy score [a]    -.02* (.01)           -.02* (.01) 
  Assignment effort (right-hand) [a]         .00 (.01)             .00 (.01) 
  Assignment effort (left-hand) [a]          .01 (.01)             .01 (.01) 
Variance components 
  Classroom-level                            .0032                 .0033 
  Student-level                              .0406                 .0400 
  Proportion reduction in variance 
  (from Model 1)                             n/a                   .0145 

Note. URM = Underrepresented minority. Standard errors are in parentheses. n/a = not included in model. 
[a] Predictor is grand-mean centered. 
†p < .10; *p < .05; **p < .01. 


FIGURE 5 Effect of treatment on posttest procedural scores. 

FIGURE 6 Effect of treatment on standardized item scores. 

The SERP-MSAN partnership departed from traditional education research studies in the 
extent to which the intervention was both shaped by the needs and the input of practitioners and 
tested with rigorous research methodologies, minimizing limitations from both practitioner and 
research perspectives. For teachers, limitations included constraints of the research study, such 
as not being able to use the assignments as homework (to minimize data loss), grade the 
assignments (except for completion), or use the AlgebraByExample assignments in all of their 
classrooms. 

From a research perspective, the limitations of the study included a sample restricted to 
suburban or small urban areas and regular, nonhonors classrooms. The impact is not known for 
students in large urban districts, or in honors or remedial classrooms. In addition, although a 
within-teacher design controlled for teacher variables, it is possible that there was some 
contamination across classrooms. 

The limitations of the study, however, are dwarfed by its benefits. The MSAN practitioners 
accomplished their goal of finding a practical intervention that improved African American and 
Latino students’ understanding of algebra. Researchers also accomplished university goals, 
including tenure-related research and graduate student dissertations. And the SERP 
organization accomplished its goals of generating knowledge and products that are of value to 
the field of education more broadly. The assignments are not only freely accessible 
(math.serpmedia.org/algebra_by_example/), but they can also be used without resource-intensive 
professional development or costly changes in curriculum. 

Although the SERP-MSAN partnership was able to navigate around obstacles to partnership 
success, it is worth noting that current funding mechanisms are particularly challenging 
because of the long delays imposed by application and review schedules. Because MSAN is a 
partnership of many districts, the changing circumstances that led some districts to withdraw 
from the collaboration were resolved when other districts joined the study, an affordance that 
is unique to the MSAN alliance. Although opportunities for funding research-practice 
partnerships are expanding, timing is likely to remain problematic unless expedited 
review and opportunities to respond quickly with revisions are put in place for partnership work 
that is already in motion. 